AITopics | fr 0

Collaborating Authors

fr 0

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Analyzing Political Bias in LLMs via Target-Oriented Sentiment Classification

Elbouanani, Akram, Dufraisse, Evan, Popescu, Adrian

arXiv.org Artificial IntelligenceMay-27-2025

Political biases encoded by LLMs might have detrimental effects on downstream applications. Existing bias analysis methods rely on small-size intermediate tasks (questionnaire answering or political content generation) and rely on the LLMs themselves for analysis, thus propagating bias. We propose a new approach leveraging the observation that LLM sentiment predictions vary with the target entity in the same sentence. We define an entropy-based inconsistency metric to encode this prediction variability. We insert 1319 demographically and politically diverse politician names in 450 political sentences and predict target-oriented sentiment using seven models in six widely spoken languages. We observe inconsistencies in all tested combinations and aggregate them in a statistically robust analysis at different granularity levels. We observe positive and negative bias toward left and far-right politicians and positive correlations between politicians with similar alignment. Bias intensity is higher for Western languages than for others. Larger models exhibit stronger and more consistent biases and reduce discrepancies between similar languages. We partially mitigate LLM unreliability in target-oriented sentiment classification (TSC) by replacing politician names with fictional but plausible counterparts.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.19776

Country:

North America > United States (1.00)
Europe > France (1.00)
Asia (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Europe Government > France Government (0.67)
Media > News (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On the Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons

Kojima, Takeshi, Okimura, Itsuki, Iwasawa, Yusuke, Yanaka, Hitomi, Matsuo, Yutaka

arXiv.org Artificial IntelligenceApr-2-2024

Current decoder-based pre-trained language models (PLMs) successfully demonstrate multilingual capabilities. However, it is unclear how these models handle multilingualism. We analyze the neuron-level internal behavior of multilingual decoder-based PLMs, Specifically examining the existence of neurons that fire ``uniquely for each language'' within decoder-only multilingual PLMs. We analyze six languages: English, German, French, Spanish, Chinese, and Japanese, and show that language-specific neurons are unique, with a slight overlap (< 5%) between languages. These neurons are mainly distributed in the models' first and last few layers. This trend remains consistent across languages and models. Additionally, we tamper with less than 1% of the total neurons in each model during inference and demonstrate that tampering with a few language-specific neurons drastically changes the probability of target language occurrence in text generation.

intervention, neuron, text generation, (15 more...)

arXiv.org Artificial Intelligence

2404.02431

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Add feedback

Can Machine Translation Bridge Multilingual Pretraining and Cross-lingual Transfer Learning?

Ji, Shaoxiong, Mickus, Timothee, Segonne, Vincent, Tiedemann, Jörg

arXiv.org Artificial IntelligenceMar-25-2024

Multilingual pretraining and fine-tuning have remarkably succeeded in various natural language processing tasks. Transferring representations from one language to another is especially crucial for cross-lingual learning. One can expect machine translation objectives to be well suited to fostering such capabilities, as they involve the explicit alignment of semantically equivalent sentences from different languages. This paper investigates the potential benefits of employing machine translation as a continued training objective to enhance language representation learning, bridging multilingual pretraining and cross-lingual applications. We study this question through two lenses: a quantitative evaluation of the performance of existing models and an analysis of their latent representations. Our results show that, contrary to expectations, machine translation as the continued training fails to enhance cross-lingual representation learning in multiple cross-lingual natural language understanding tasks. We conclude that explicit sentence-level alignment in the cross-lingual scenario is detrimental to cross-lingual transfer pretraining, which has important implications for future cross-lingual transfer studies. We furthermore provide evidence through similarity measures and investigation of parameters that this lack of positive influence is due to output separability -- which we argue is of use for machine translation but detrimental elsewhere.

es 0, es fr ru, fr 0, (15 more...)

arXiv.org Artificial Intelligence

2403.16777

Country:

Europe > Finland > Uusimaa > Helsinki (0.04)
North America > Dominican Republic (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Rainproof: An Umbrella To Shield Text Generators From Out-Of-Distribution Data

Darrin, Maxime, Piantanida, Pablo, Colombo, Pierre

arXiv.org Artificial IntelligenceNov-1-2023

Implementing effective control mechanisms to ensure the proper functioning and security of deployed NLP models, from translation to chatbots, is essential. A key ingredient to ensure safe system behaviour is Out-Of-Distribution (OOD) detection, which aims to detect whether an input sample is statistically far from the training distribution. Although OOD detection is a widely covered topic in classification tasks, most methods rely on hidden features output by the encoder. In this work, we focus on leveraging soft-probabilities in a black-box framework, i.e. we can access the soft-predictions but not the internal states of the model. Our contributions include: (i) RAINPROOF a Relative informAItioN Projection OOD detection framework; and (ii) a more operational evaluation setting for OOD detection. Surprisingly, we find that OOD detection is not necessarily aligned with task-specific measures. The OOD detector may filter out samples well processed by the model and keep samples that are not, leading to weaker performance. Our results show that RAINPROOF provides OOD detection methods more aligned with task-specific performance metrics than traditional OOD detectors.

detection, detector, mean 0, (13 more...)

arXiv.org Artificial Intelligence

2212.09171

Country:

Oceania > Australia (0.04)
Europe > Austria (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
(6 more...)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models

Rust, Phillip, Søgaard, Anders

arXiv.org Artificial IntelligenceAug-17-2023

Language models such as mBERT, XLM-R, and BLOOM aim to achieve multilingual generalization or compression to facilitate transfer to a large number of (potentially unseen) languages. However, these models should ideally also be private, linguistically fair, and transparent, by relating their predictions to training data. Can these requirements be simultaneously satisfied? We show that multilingual compression and linguistic fairness are compatible with differential privacy, but that differential privacy is at odds with training data influence sparsity, an objective for transparency. We further present a series of experiments on two common NLP tasks and evaluate multilingual compression and training data influence sparsity under different privacy guarantees, exploring these trade-offs in more detail. Our results suggest that we need to develop ways to jointly optimize for these objectives in order to find practical trade-offs.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2308.08774

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
North America > Dominican Republic (0.04)
(22 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

Add feedback

Leveraging Label Variation in Large Language Models for Zero-Shot Text Classification

Plaza-del-Arco, Flor Miriam, Nozza, Debora, Hovy, Dirk

arXiv.org Artificial IntelligenceJul-24-2023

The zero-shot learning capabilities of large language models (LLMs) make them ideal for text classification without annotation or supervised training. Many studies have shown impressive results across multiple tasks. While tasks, data, and results differ widely, their similarities to human annotation can aid us in tackling new tasks with minimal expenses. We evaluate using 5 state-of-the-art LLMs as "annotators" on 5 different tasks (age, gender, topic, sentiment prediction, and hate speech detection), across 4 languages: English, French, German, and Spanish. No single model excels at all tasks, across languages, or across all labels within a task. However, aggregation techniques designed for human annotators perform substantially better than any one individual model. Overall, though, LLMs do not rival even simple supervised models, so they do not (yet) replace the need for human annotation. We also discuss the tradeoffs between speed, accuracy, cost, and bias when it comes to aggregated model labeling versus human annotation.

computational linguistic, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2307.12973

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Maryland > Baltimore (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Linear Cross-Lingual Mapping of Sentence Embeddings

Vasilyev, Oleg, Isono, Fumika, Bohannon, John

arXiv.org Artificial IntelligenceMay-23-2023

Semantics of a sentence is defined with much less ambiguity than semantics of a single word, and it should be better preserved by translation to another language. If multilingual sentence embeddings intend to represent sentence semantics, then the similarity between embeddings of any two sentences must be invariant with respect to translation. Based on this suggestion, we consider a simple linear cross-lingual mapping as a possible improvement of the multilingual embeddings. We also consider deviation from orthogonality conditions as a measure of deficiency of the embeddings.

artificial intelligence, natural language, transformation, (15 more...)

arXiv.org Artificial Intelligence

2305.14256

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Dominican Republic (0.04)
Europe > Slovenia (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.99)

Add feedback

A Bayesian approach to multi-task learning with network lasso

Shimamura, Kaito, Kawano, Shuichi

arXiv.org Machine LearningOct-18-2021

Network lasso is a method for solving a multi-task learning problem through the regularized maximum likelihood method. A characteristic of network lasso is setting a different model for each sample. The relationships among the models are represented by relational coefficients. A crucial issue in network lasso is to provide appropriate values for these relational coefficients. In this paper, we propose a Bayesian approach to solve multi-task learning problems by network lasso. This approach allows us to objectively determine the relational coefficients by Bayesian estimation. The effectiveness of the proposed method is shown in a simulation study and a real data analysis.

coefficient, fr 0, network lasso, (14 more...)

arXiv.org Machine Learning

2110.0904

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report (0.50)

Industry: Education > Focused Education > Special Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Beyond Target Networks: Improving Deep $Q$-learning with Functional Regularization

Piché, Alexandre, Marino, Joseph, Marconi, Gian Maria, Pal, Christopher, Khan, Mohammad Emtiyaz

arXiv.org Machine LearningJun-7-2021

Target networks are at the core of recent success in Reinforcement Learning. They stabilize the training by using old parameters to estimate the $Q$-values, but this also limits the propagation of newly-encountered rewards which could ultimately slow down the training. In this work, we propose an alternative training method based on functional regularization which does not have this deficiency. Unlike target networks, our method uses up-to-date parameters to estimate the target $Q$-values, thereby speeding up training while maintaining stability. Surprisingly, in some cases, we can show that target networks are a special, restricted type of functional regularizers. Using this approach, we show empirical improvements in sample efficiency and performance across a range of Atari and simulated robotics environments.

latexit sha1, regularization, target network, (13 more...)

arXiv.org Machine Learning

2106.02613

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > Canada > Quebec > Montreal (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Classification of EEG Signal based on non-Gaussian Neutral Vector

Ma, Zhanyu

arXiv.org Machine LearningAug-2-2018

In the design of brain-computer interface systems, classification of Electroencephalogram (EEG) signals is the essential part and a challenging task. Recently, as the marginalized discrete wavelet transform (mDWT) representations can reveal features related to the transient nature of the EEG signals, the mDWT coefficients have been frequently used in EEG signal classification. In our previous work, we have proposed a super-Dirichlet distribution-based classifier, which utilized the nonnegative and sum-to-one properties of the mDWT coefficients. The proposed classifier performed better than the state-of-the-art support vector machine-based classifier. In this paper, we further study the neutrality of the mDWT coefficients. Assuming the mDWT vector coefficients to be a neutral vector, we transform them non-linearly into a set of independent scalar coefficients. Feature selection strategy is proposed on the transformed feature domain. Experimental results show that the feature selection strategy helps improving the classification accuracy.

artificial intelligence, data quality, machine learning, (17 more...)

arXiv.org Machine Learning

1808.00814

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Add feedback